I. INTRODUCTION

A. SPLICE AND DATA DICTIONARY

The SPLICE (Stock Point Logistics Integrated
Communication Environment) concept comes as a result of the
always growing demands of the U.S. Navy for automated data
processing [Ref. 1] and inventory control at various points.
A design and implementation strategy is necessary, based on
a distributed architecture for a local area network (LAN).

SPLICE is designed to increase the ADP facilities of the
existing Navy stock points and inventory control points.
Because the current Uniform Automated Data Processing
System-Stock Points cannot support the growing requirements
for automated data processing (ADP) without a total
redesign, an effort has been undertaken to improve the system
in the short and long term [Ref. 1]. Two major objectives are
behind the SPLICE development:

1. To increase CRT display terminals so users can
interactively access the system's data base.

2. To standardize the various current interfaces across the
62 supply sites.














The design approach starts from the design of the
logical or virtual Local Area Network (LAN), by specifying
all the functional modules, their characteristics, and the
communication protocols without focusing on the hardware
characteristics. A later phase of the SPLICE project will
anticipate the mapping of the virtual LAN requirements onto
a physical local network.

The following functional modules are involved in the
development of the system.

- Local communications (LC)

- National communications (NC)

- Front-End processing (FEP)

- Terminal management (TM)

- Data base management (DBM)

- Session services (SS)

- Peripheral management (PM)

- Resource allocation (RA)

This LAN design provides for distributed control but
does not provide for the distribution of data bases within a
LAN. The data bases of the SPLICE system are geographically
distributed over a wide area and, for the purpose of
maintaining the integrity of the system, the data base
functions are centralized within each LAN. A DBMS module for
the system must at least provide dictionary, integrity,
recovery, query language, and security features as well as
compatibility with existing COBOL programs.

The functions of the DBM module would be:

- Catalog, to maintain a catalog of file names and
status (name, open or closed, size, physical address of
file, physical address of index, application used in, date
entered into system, expiration date if any, location of
backup copy, format, access restrictions).

- Operations, under a menu selection scheme, to perform
various functions (retrieve and display a record, update
specified fields of a record, delete a record, insert a
record, print a file, print a record or specified fields of
a record, answer specified queries and display and print the
results).

- Dictionary, for defining and characterizing the data
elements. The dictionary must be integrated with the DBMS.
This will contribute to data integrity and consistency
throughout the system and should also be of great assistance
in designing report formats.
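
The catalog function listed above can be pictured as a small
record structure. The following Python sketch is only an
illustration: the class names, field names, units, and the
expired() helper are assumptions introduced here, not part of
the SPLICE specification.

```python
from dataclasses import dataclass
from datetime import date
from typing import Dict, List, Optional, Tuple


@dataclass
class CatalogEntry:
    """One file's catalog record; field names are illustrative."""
    name: str
    is_open: bool                  # open or closed status
    size: int                      # size in bytes (unit is an assumption)
    file_address: int              # physical address of the file
    index_address: int             # physical address of the index
    application: str               # application the file is used in
    date_entered: date
    expiration_date: Optional[date]
    backup_location: str
    record_format: str
    access_restrictions: Tuple[str, ...] = ()


class Catalog:
    """Hypothetical in-memory catalog keyed by file name."""

    def __init__(self) -> None:
        self._entries: Dict[str, CatalogEntry] = {}

    def register(self, entry: CatalogEntry) -> None:
        if entry.name in self._entries:
            raise ValueError(f"file already cataloged: {entry.name}")
        self._entries[entry.name] = entry

    def lookup(self, name: str) -> CatalogEntry:
        return self._entries[name]

    def expired(self, today: date) -> List[str]:
        # Names of files whose expiration date has passed.
        return sorted(
            e.name for e in self._entries.values()
            if e.expiration_date is not None and e.expiration_date < today
        )
```

Keeping every catalog attribute in one record makes it easy for
the dictionary and the menu-driven operations to consult the
same description of a file.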

With this improved design it is believed that the SPLICE
system will provide economical and responsive support
capabilities among the 62 different geographical locations,
each having a different mix of application and terminal
requirements.

The SPLICE functional design approach suggests
developing several functional modules, distributed in
minicomputers throughout the LAN, with the necessary
communications to support them [Ref. 2]. This design provides
for higher system availability than the centralized approach
since functional modules can be moved from one physical node
to another without changing their logical addresses [Ref. 3].
At present there exist no exact methods for designing
distributed systems, and so an objective of the NPS research
program for SPLICE is to advance knowledge about distributed
systems and to increase understanding of how distributed
systems must be designed in order to operate effectively.

Distributed systems have problems associated with their
design that need solutions in particular areas [Ref. 4: p.
2]. The distributed system must provide the ability for the
user to communicate and access information across the 62
local networks interconnected by the Defense Data Network
(DDN). It must be possible for the user at Naval Supply
Center (NSC) Oakland to access the Inventory Control Point
(ICP) database at Mechanicsburg in the same way as the local
database at Oakland [Ref. 4].

The data dictionary must provide support to the above by
uniquely naming and identifying objects in the overall
SPLICE system. In the case of a message which is destined
to another local network, the dictionary can be used to
obtain the physical destination address with the help of
the Session Services module (Figure 1.1).

Figure 1.1 Network Services Directory and Dictionary.

For object identification and addressing and software
maintenance, the data dictionary can help by storing all the
name-to-address mapping and routing information. The data
dictionary can also be used to specify task requirements for
the user terminal processes. The data dictionary in a
distributed environment will cooperate closely with the
session services module, which provides assistance to the
user terminal processes in carrying out their tasks. Thus a
distributed operating system must provide, in addition to
other functions, the ability to access effectively the
dictionary/directory system (Figure 1.2 from Ref. 4).
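
As a rough illustration of the name-to-address mapping
described above, the sketch below keeps a table from logical
names to physical addresses and decides whether a message stays
on the local network or must cross the DDN. The class, the
method names, and the sample site names are hypothetical; the
actual SPLICE directory layout is not given in the text.

```python
from typing import Dict, NamedTuple


class Address(NamedTuple):
    lan: str    # local network that hosts the object
    node: str   # physical node inside that network


class NameDirectory:
    """Name-to-address table held by the dictionary (hypothetical API)."""

    def __init__(self, local_lan: str) -> None:
        self.local_lan = local_lan
        self._table: Dict[str, Address] = {}

    def bind(self, logical_name: str, address: Address) -> None:
        # Rebinding an existing name models moving a module to another
        # node while its logical name stays the same.
        self._table[logical_name] = address

    def resolve(self, logical_name: str) -> Address:
        return self._table[logical_name]

    def route(self, logical_name: str) -> str:
        """Return 'LAN' for local delivery, 'DDN' for another network."""
        target = self.resolve(logical_name)
        return "LAN" if target.lan == self.local_lan else "DDN"
```

Because callers use only the logical name, a module can be moved
between nodes by rebinding it, without changing the programs
that address it.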

Major systems of the SPLICE application environment are
the Integrated Disbursement and Accounting (IDA), Automated
Procurement and Data Entry (APADE), Uniform Automated Data
Processing System-Stock Points (UADPS-SP), and Logistics
Data System Trident (LDS). Each of the above systems has its
own elements, files, programs, transactions, users, and
reports [Ref. 4].

It is vital for the system to manage all the resources
efficiently, and the distributed environment makes this
more difficult. A data dictionary/directory (D/D) system
seems to be one approach to solving the data design and
management problem. For the centralized database environment
the following aspects are emphasized [Ref. 5]:

- The software interfaces between the D/D system and
other software packages

- The convert functions of the D/D system

- The environmental dependency between the D/D system and
a database management system (DBMS).

For the distributed database environment, as in the case
of SPLICE, there must be extensions to the centralized
approach, additional software interfaces required, and the
use of the D/D as a distributed database.

Figure 1.2 Layered Operating System Design (Ref. 4).

B. OBJECTIVES OF THESIS

The SPLICE project at the Naval Postgraduate School
(NPS) takes the approach of designing the logical or virtual
Local Area Network (LAN) first, specifying all the
functional modules, their characteristics and the
communication protocols, rather than focusing on the hardware
characteristics of the LAN first [Ref. 1], developing
alternatives for SPLICE Local Area Networks. After providing
a functional specification for a distributed operating
system, user interface specifications are provided, where the
dictionary/directory system (DDS) constitutes a major
component [Ref. 4] and its function is to provide support for
naming and identifying objects in SPLICE.

The objectives of this thesis are to investigate the
area of data dictionary/directory systems (DDS), to outline
the advantages/disadvantages of these systems, and to
present the underlying ideas. Special attention is paid to
the distributed environment and to the benefits for the
SPLICE system from using a dictionary/directory system.
Finally, an attempt will be made to introduce the interface
requirements between a data dictionary/directory system for
SPLICE and the neighboring modules.

II. DICTIONARY/DIRECTORY SYSTEMS

A. GENERAL REVIEW



A data dictionary is a description of data resources. It
contains both machine-readable and human-readable
descriptions of the database tables, their attributes,
interrelationships, and semantics. It is usually not very
large, but it has a very rich structure. Most systems have a
data dictionary facility which stores metadata about the
database aside from the database itself. The data dictionary
is often built on top of the DBMS as a special application
with a special data definition language.

Thus a DDS is a set of one or more databases containing
data about an organization's information resources. These
resources can be retrieved and analyzed using standard
database management system (DBMS) capabilities. The concept
of a data dictionary system has existed in the data
processing industry for a number of years. Use of such a
system consists, basically, of an attempt to capture and
store in a central location definitions of data and other
entities of interest [Ref. 6]. The principles of such a
system are:

- Provide for better data control

- Provide for better documentation

- Improve the quality of the systems that are built in
terms of user functionality and satisfaction and system
maintainability.

The data dictionary helps to capture and document data
elements, their definitions and some of their descriptive
attributes. It also provides for logical grouping of data
elements during the process of gathering requirements to
build a new system. The data element dictionary provides the
vocabulary that can be used between the systems analyst and
the end-user [Ref. 6].

Next in the spectrum of usage, the help of the DDS is
twofold. First, if the data dictionary is available it can be
extended to include information on how and by whom the data
elements can be used. Thus a dictionary can be used to store
the definitions of data elements and the definitions of other
data constructs (records, files), the definitions of
processes (programs or manual processes), and definitions of
data users (individuals, organizations). The second trend
that contributed to this extended usage of a dictionary
system was the gradual migration away from the use of
traditional files toward the concept of a central, integrated
database, distributed across the DDN but centralized within
each LAN, under the control of a database management system.

The problem of duplication of data (data redundancy) can
be solved inside each LAN, but another mechanism must be
provided in order to solve that problem across the DDN.
This problem must be examined carefully and that mechanism
must provide for economy, because sometimes data redundancy
may be more cost-efficient than the frequent use of the DDN.

The above is vital for system design because in the
SPLICE environment, data are to be shared not only by
different systems, but also by a wide range of users. The
basic concept of a DBMS is to provide a centrally located
set of definitions of data within each LAN that is to be
shared, in order to assure that different users will access
common data with a set of consistent definitions.

The DDS acts as a repository of all definitive
information about the database such as characteristics,
relationships, and access authorizations. These databases, as
implied by the term 'logically', may be physically stored in
diverse locations within each LAN but are logically linked
via communications and the DDS.

The data dictionary system located in a node within each
LAN can be used to provide the above definitions and thus
the required data consistency.

Separating the data dictionary from the database raises
two problems [Ref. 7]:

- The dictionary and data base may disagree with one
another unless one interface has control of both functions.

- Having a separate data dictionary implies having a
separate language for the definition and manipulation of the
dictionary database.

Users who define tables and other objects (as in the case
of System R) are encouraged to include English text to
describe the meanings of the objects. Later, other users can
retrieve tables with certain attributes or can browse among
the descriptions of defined tables, if they are so
authorized. A user can later modify these entries to change
the attributes of an object.



B. MANAGEMENT OF INFORMATION RESOURCES

Information resource management (IRM) is a methodology
that attempts to solve a set of problems related to the
system life cycle in an integrated and coordinated manner.
The data dictionary system will play an important role in
this area.

In the case of SPLICE the DDS can play an important role
in providing a documented inventory of information
resources, a control mechanism for the analysis and design
of new information resources, and the necessary resource
independence.

A data dictionary can be used as a powerful tool (not as
a solution) that can aid in the solution to various problems
such as inventory control, report production, proper
routing of data, proper routing of requests, data
consistency, security, etc.

Finally, the dictionary system project is in fact an
Information Resource Management (IRM)1 project. The SPLICE
system possesses much valuable data that has been generated,
collected, and stored in an automatic and 'formatted' state.
Utilization of any class of data involves one or more
processes. These are [Ref. 6]:

- Collection: It is a process that tends to be expensive,
as the cost of identification and recording (including
input to an automated system, as necessary) can be high.

- Processing: The data collected is generally 'managed'
in some fashion before and/or after being stored. In the
case of automated data, this occurs through the use of
computer programs.

- Storage: The repository of data and information,
termed a "data base".

- Retrieval: Using the knowledge about the storage
technique being used, data are retrieved to answer questions
or to be modified.

- Communications: A communication line is needed to
connect the user terminal with the place where the
dictionary resides.



1 Information Resource Management is whatever policy,
action, or procedure concerning information (both automated
and non-automated) which management establishes that serves
the overall current and future needs of the system. Such
policies, etc. would include considerations of availability,
timeliness, accuracy, integrity, privacy, security,
auditability, ownership, use, and cost effectiveness [Ref. 6].

The environment in which the above processes take place
is composed of:

- Data/information. Represents the core of the
entire information processing spectrum.

- The users in the system. The personnel involved
with the system. These are users of data and other
information components.

- Physical facilities. Computer hardware and other
physical devices used in data processing.

- Processing facilities. These are all the activities
which take place in the use of physical facilities.

- Support facilities. All the services which are
required by users of data as well as personnel whose
responsibilities are primarily in the information systems
area.

Each of the above components is referred to as an
Information Resource. The computer system must provide an
integrated and coordinated way to manage the entire
information resource of the SPLICE system, and the data
dictionary has to play a major role in conjunction with the
database management module.

C. SUPPORT OF SYSTEM LIFE CYCLE

In this section, we present some highlights of how the
data dictionary supports the main steps of system
development.

The waterfall model of the software life cycle [Ref. 14]
consists of the following stages: system feasibility,
requirements specification, product design, detail design,
coding, integration, implementation, operations and
maintenance. Of course there are also other models of a
software life cycle, but basically the functions of a DDS are
the same in whatever model we consider.

During the system's feasibility stage the DDS can be
used for new data element collection and to avoid
redundancies and inconsistencies. Also the DDS can contain a
description of processes that are already available and so
help in assessing the true magnitude of the proposed task.
During the requirements specification stage, the data
dictionary can provide the means to detect existing
inaccuracies in definitions and to correct them before the
system operation. This is because the DDS contains the
overall scope of the requirements to be specified.

During the product design and detail design stages, the
DDS can help because it contains the design details of both
data and processes, which can be shared by all members of
the design team. Particularly in database design the DDS
can record multiple user views, pass output from the logical
design phase to the physical design phase, generate multiple
designs for benchmark testing, and verify the existing
conversions of data in the system. For the rest of the
stages the DDS can help in data collection, coding, and
testing, by providing any desired degree of coordination and
control over tasks, generating data structures, storing
instructions for the staff, describing the various jobs and
activities, and finally, providing a means for effective and
consistent modification of the system.

Additional benefits that can be derived from the DDS
[Ref. 6] are naming standards, aid to auditing, interfaces
to application program development tools, and software
configuration management. A DDS allows a system to be
extended through the addition of new entity types,
relationship types, and attribute types, and also can be used
to add configuration entity types such as requirements
specifications, change notices, etc. The major advantage from
the use of the DDS is in the case of an active system, where
the system not only records the entities, but also controls
how they are revised.
The organizational structure for a DDS that is to be
adopted must be commensurate with the size of the activity
at any one time. Such a structure is displayed in Figure
2.1.

Figure 2.1 SPLICE Data Admin. Function Organization.

The Data Administrator is the person responsible for
articulating the data policy after the major guidelines have
been laid down by the designing team. That policy includes
planning for data collection, its structuring, its storage,
and its quality control. For the SPLICE system the Data
Administrator can be a person or a team located in any
place, whose main function will be the setting of the above
policy.

The Dictionary Administrator is the person (or team)
responsible for the dictionary system within the Data
Administrator function (e.g. recording of all
meta-information and meta-data and its maintenance through
the use of the dictionary system, along with making its
facilities available to the users of this system). Because in
the SPLICE system the data dictionary is unique throughout
the system and no different views of the data dictionary are
permitted in the various locations, that team or person must
be unique throughout the system. Only that team (or person)
must have the privilege to maintain the DD.

The Database Administrator is the person (or team)
responsible for the technical aspects of obtaining, running
and maintaining the DBMS. Since SPLICE is a distributed
system with databases distributed across 62 different
locations, the Database Administrator does not need to be
unique. The required policy and definitions are set up by
the data dictionary administrator and this is enough to
maintain consistency throughout the whole system.

The Data Quality Inspection team also has a role in the
hierarchy, and its function is the quality inspection of the
information or data, and the quality audit trail of the
whole system. This can be one or more teams. In the case of
several teams the entire audit effort can be divided among
them.



E. CONCEPTS ON DDS SELECTION AND EVALUATION

It is very difficult to find a commercially available DDS
to meet exactly the requirements of a system under
development. A selection and evaluation process composed of
various steps must be developed in order to select the best
system.

Four steps are proposed by [Ref. 6] for the process of
selection and evaluation of a DDS:

- Determine the requirements for the dictionary system.
These should be classified as either being mandatory or not.
If not mandatory, establish a scale and assign numbers
indicating the importance.

- Develop a list of features of dictionary systems that
will be used in the evaluation of systems.

- Determine a mapping from the needs onto these features.

- For each mapping, using descriptions of available
systems, a system can be found either to qualify or not.
This process leads to the elimination of systems that do not
qualify.
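
The four steps above can be sketched as a simple screening and
scoring routine. This is a minimal illustration under stated
assumptions: the Requirement class, the rank_candidates
function, and the additive weighting of non-mandatory
requirements are introduced here for the example, and [Ref. 6]
may prescribe a different scoring rule.

```python
from dataclasses import dataclass
from typing import Dict, Iterable, List, Set, Tuple


@dataclass(frozen=True)
class Requirement:
    feature: str            # dictionary feature the need maps onto
    mandatory: bool = False
    weight: int = 0         # importance, used only when not mandatory


def rank_candidates(
    requirements: Iterable[Requirement],
    candidates: Dict[str, Set[str]],
) -> List[Tuple[str, int]]:
    """Drop systems missing a mandatory feature, then rank the rest."""
    reqs = list(requirements)
    mandatory = {r.feature for r in reqs if r.mandatory}
    optional = [r for r in reqs if not r.mandatory]
    ranked = []
    for name, features in candidates.items():
        if not mandatory <= features:
            continue  # does not qualify
        score = sum(r.weight for r in optional if r.feature in features)
        ranked.append((name, score))
    # Highest score first; ties broken by name for a stable order.
    ranked.sort(key=lambda pair: (-pair[1], pair[0]))
    return ranked
```

A shared scoring function of this kind gives every evaluator
the same measurement method, whatever weights are finally
chosen.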

We cannot say that the above procedure is perfect and
free of the risk of mistakes, because it is subjective and
depends on the experience and judgment of the
selection/evaluation team. Some of the more common reasons
leading to mistakes are: the needs were never properly
assessed, and potential users were not asked the right
questions; unnecessary but apparently "nice" features were
given high values; the evaluation of the systems was
inconsistent because different people evaluated different
systems without a well-defined measurement method; undue
emphasis was placed on features that will be needed in the
future but are unimportant now; etc.

For the SPLICE system we cannot follow the above
procedure. SPLICE has decided to use Tandem as their "front
end" minicomputer. As a result, selecting a DDS is largely a
foregone conclusion in this situation. So we have to use the
Tandem DBMS and the associated dictionary capabilities.

F. ADDITIONAL ASPECTS OF DDS

In the next few years, several extensions to dictionary
systems, not available today, will most likely be
commercially available. These additions will allow
dictionaries to be more effective in interfacing with the
information resources. The use of extensibility facilities
allows an installation to customize the dictionary system in
order to make it effective in such applications. Such
examples are the use of DDS to control the total information
resource, to aid in the analysis, design and development of
information systems, and to aid in efficient database design.
The last application example is the use of DDS as a
repository of information for an entire system. This is
exactly the major role the DDS has to play in the SPLICE
system.

Referring to the SPLICE application environment, the DDS
would require users and analysts to define the system data
elements, files, etc., which would entail updating old
definitions, discarding outdated ones, and introducing new
ones. In this way standards of data definition and
description for application programs can be established over
the entire SPLICE system [Ref. 4]. But on the other hand it
is a Herculean task to retrofit a dictionary to existing
application systems. Because of the many difficulties in
implementing the dictionary for old application systems, we
recommend, as much more preferable, implementing a
dictionary for new applications only. That means that the
dictionary will be developed gradually and a long period will
be needed before it is fully implemented for the whole SPLICE
system.

Although DDSs have many advantages, their disadvantages
should be mentioned as well. Dictionary systems are complex
software systems and the execution of many dictionary
functions may consume a significant part of the system
resources. As the scope of the dictionary is enlarged to
include an ever larger number of information resources, the
DDS will gradually begin to look like the major resource
consumer, and thus the main user of the host computer system
[Ref. 6]. When we consider active interfaces of the DDS,
the previous problem becomes more serious. If the DDS
controls a process through one of these active interfaces,
it follows that this process cannot proceed until such time
as the dictionary system has finished its job. This delay
time is added to the whole process time. Given that there
can be many processes, the continuous use of the DDS and the
accumulated service time may eventually result in a
bottleneck.

The proposed solution for the SPLICE system [Ref.] can
avoid (or at least reduce) this overhead by locating one
copy of the DDS in each LAN. With this simple and efficient
technique each user located in any of the 62 stock and
inventory control points only needs to consult the local
DDS. The number of users who need the DDS services remains
the same, but the overhead from the long queuing time across
the DDN will be reduced by a factor close to 62. By
locating the master copy of the DDS in one place we can
solve the maintenance problem of the DDS, because additions,
deletions and updates of the DDS can be done only via the
master copy by the Dictionary Administrator. All the other
copies can be updated only remotely by the master copy
through the DDN, in such a way as to represent the exact
image of the master copy. Because changes in definitions
(deletions, updates, additions) are not frequent, we
estimate that the whole process of updating the local copies
of the DDS will not be expensive, and the resultant overhead
will not be significant. Of course this assumes all 62
LANs are working off the same schema, and the application
environment is homogeneous across the network.
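
The master-copy scheme described above can be pictured with the
following sketch. It is an assumption-laden illustration, not
the SPLICE or Tandem design: the class names, the version
counter, and the whole-image install step are introduced here,
and the transfer over the DDN is reduced to a method call.

```python
from typing import Dict, List


class LocalCopy:
    """Read-only dictionary copy held in one LAN (hypothetical)."""

    def __init__(self, lan: str) -> None:
        self.lan = lan
        self.version = 0
        self._definitions: Dict[str, str] = {}

    def install(self, version: int, definitions: Dict[str, str]) -> None:
        # Called only by the master: replace the whole image.
        self.version = version
        self._definitions = dict(definitions)

    def lookup(self, name: str) -> str:
        return self._definitions[name]


class MasterDictionary:
    """Single point of change, owned by the Dictionary Administrator."""

    def __init__(self, administrator: str) -> None:
        self.administrator = administrator
        self.version = 0
        self._definitions: Dict[str, str] = {}
        self._copies: List[LocalCopy] = []

    def attach(self, copy: LocalCopy) -> None:
        self._copies.append(copy)
        copy.install(self.version, self._definitions)

    def _check(self, user: str) -> None:
        if user != self.administrator:
            raise PermissionError(f"{user} may not change the dictionary")

    def _publish(self) -> None:
        # Push an exact image of the master to every local copy.
        self.version += 1
        for copy in self._copies:
            copy.install(self.version, self._definitions)

    def define(self, user: str, name: str, definition: str) -> None:
        self._check(user)
        self._definitions[name] = definition
        self._publish()

    def delete(self, user: str, name: str) -> None:
        self._check(user)
        del self._definitions[name]
        self._publish()
```

Lookups never leave the local copy, so only the infrequent
definition changes generate traffic from the master.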



HIERARCHY OF DDS

A good hierarchical DDS structure is significant if we
want to avoid the "bottleneck" mentioned above. A structure
is proposed in Figure 2.2 and we believe that it is less
expensive in consuming the system resources than the
structure of having different views of the master dictionary
at each LAN.

Figure 2.2 A First DDS Hierarchical Structure for SPLICE.

In particular, suppose the copies of the local
dictionaries are not exact images of the master dictionary,
but are different views of the master, especially views
containing information only for the local database. In such
a case it is not useful to separate the definitions from the
actual database since the different views of the whole
database are centralized within each LAN. If a spare part,
for example, cannot be found in a local database, then the
user has to consult the master dictionary to find the
location of the requested spare part, because the local copy
of the data dictionary does not contain information about
other data bases of the system. In this case the user has to
access the DDN twice: first to consult the master dictionary
and then to consult the local database in which the spare
part is located. This procedure can easily lead to long
waiting times and finally to a "bottleneck", because the
master dictionary will have to answer questions coming from
62 different LANs.

A second hierarchical structure is shown in Figure 2.3.
This structure involves the location of a copy of the DDS in
selected nodes instead of each node. In this way we reduce
the amount of secondary memory needed to store the D/D but
we increase the use of the DDN. This increase in use of the
DDN is inversely proportional to the number of D/D
replicated copies. The solution of locating exact copies of
the master dictionary in each or selected LANs has the
disadvantage of consuming more secondary storage, but our
estimation is that this is preferable and less expensive
than the frequent use of the DDN in order to consult the
master copy.
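
The trade-off between stored copies and DDN use can be made
concrete with a small counting sketch. It is a simplified
model under stated assumptions: each request needs at most one
dictionary lookup and one data access, each of which costs one
DDN trip when the target is not in the user's own LAN. The
function names are introduced here for illustration only.

```python
from typing import Tuple


def ddn_accesses(dictionary_is_local: bool, data_is_local: bool) -> int:
    """Count DDN trips for one request: lookup trip plus data trip."""
    lookup_trip = 0 if dictionary_is_local else 1
    data_trip = 0 if data_is_local else 1
    return lookup_trip + data_trip


def storage_versus_traffic(n_lans: int, n_copies: int) -> Tuple[int, int]:
    """Return (dictionary copies stored, LANs whose lookups cross the DDN)."""
    if not 1 <= n_copies <= n_lans:
        raise ValueError("need between 1 and n_lans copies")
    return n_copies, n_lans - n_copies
```

For example, a user whose LAN holds an exact copy and who needs
a remote spare part makes one DDN trip, while a user who must
first ask the master dictionary makes two, which matches the
"twice" case described in the text.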

We cannot say that distribution instead of replication
of the DDS is an inefficient method not acceptable for
SPLICE. Since there is not enough experience with
distributed systems, and especially with data dictionaries,
we have to examine carefully every possible architecture, and
the pros and cons of each one, in order to make the best
decision. But still we believe that the decision will be
based more on estimations coming from intuition and less on
experience and statistical information. Such an
architecture, based on distribution instead of replication of
the D/D for SPLICE, is shown in Figure 2.4 and will be
examined in a later chapter.

Figure 2.3 A Second DDS Hierarchical Structure for SPLICE.

Figure 2.4 A Third DDS Hierarchical Structure for SPLICE.

[...] DDS and a more detailed [...] This presentation is a
[...] concern any particular system. A cost/benefit analysis
can tell us which features need to be included in a DDS under
development. It is a more preferable approach than to
develop a DDS as described below using the Tandem DBMS
capability.

1. Architecture and Implementation

The relationship between DDS and DBMS will be addressed
here. The purpose of a DBMS is to manage data and the
purpose of a DDS is to manage meta-data.2 The question is
whether the DDS must be a free-standing3 or DBMS-dependent4
system [Ref. 6].

2 Meta-data is the data that describes the data.

3 A dictionary system which does not use a DBMS in its
implementation.

4 A dictionary system which does use a DBMS in its
implementation.

The free-standing approach is good for commercial
systems because each enterprise can evaluate the pros and
cons and reach the optimal decision whether to buy or not.
This approach raises compatibility problems between the DDS
and the DBMS, especially when the vendors are different
companies. There are many factors we have to take into
account when deciding whether a DDS must be free-standing or
DBMS-dependent. These factors include the method of
implementation, the scope of usage, whether the DDS and DBMS
are going to be developed together or not, and whether they
are going to be supplied by the same vendor or not.

One other feature of DDS architectural structure is
whether the DDS should be passive or active. Suppose there
is a compiler, application program, or other process that
requires meta-data for its execution. There should be a DDS
facility available which automatically produces the required
meta-data. This functionality is referred to as the
dictionary interface and can operate in two modes. Passive,
where there exists an option of whether the process will
retrieve the required meta-data (through the dictionary
interface or from elsewhere) or, in the case where the
process already contains the meta-data, there exists an
option for the system to check whether this meta-data is the
most current version in the dictionary. Here the dictionary
is not in the critical path of a process. Active, where the
above options do not exist and the process always uses the
most current meta-data in the dictionary. The dictionary
here is in the critical path of the process and the process
must go through the dictionary for the meta-data in order to
execute properly.

A DDS can contain both kinds of interfaces. We have to keep
in mind that the interfaces of the DDS do not only concern
the DDS itself, but also other modules with which the
dictionary has to cooperate in order to maintain the whole
system.
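
The difference between the passive and the active mode can be
illustrated with a small sketch. It is not part of the SPLICE
design; the class names (Dictionary, PassiveProcess,
ActiveProcess) and the use of a version counter to detect
stale meta-data are assumptions made only for illustration.

```python
from dataclasses import dataclass


@dataclass
class MetaData:
    """One dictionary entry: a named description plus a version counter."""
    name: str
    description: str
    version: int


class Dictionary:
    """A toy in-memory data dictionary keyed by entry name (hypothetical)."""

    def __init__(self):
        self._entries = {}

    def define(self, name, description):
        # Re-defining an entry bumps its version so stale copies can be detected.
        old = self._entries.get(name)
        version = old.version + 1 if old else 1
        self._entries[name] = MetaData(name, description, version)

    def lookup(self, name):
        return self._entries[name]


class PassiveProcess:
    """Holds its own copy of the meta-data; consulting the dictionary is optional."""

    def __init__(self, dictionary, name):
        self._dictionary = dictionary
        self.cached = dictionary.lookup(name)

    def is_current(self):
        # The optional check: is the embedded copy still the latest version?
        latest = self._dictionary.lookup(self.cached.name)
        return self.cached.version == latest.version

    def run(self):
        # The dictionary is not in the critical path: the cached copy is used.
        return self.cached.description


class ActiveProcess:
    """Always fetches the meta-data from the dictionary at execution time."""

    def __init__(self, dictionary, name):
        self._dictionary = dictionary
        self._name = name

    def run(self):
        # The dictionary is in the critical path of every execution.
        return self._dictionary.lookup(self._name).description
```

In this sketch a passive process keeps working with its old
copy after the dictionary entry changes, and only learns about
the change if it asks; an active process sees the change on
its next execution.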
2. Logical Schema, Entity Types, Relationships



Dictionary schema is the term denoting the logical structure
of a dictionary. Structural characteristics and contents of
the dictionary schema determine the kinds of meta-data and
the relationships to be established among them. Using the
entity-relationship-attribute model [Ref. 6] for the
dictionary, we define entities as real world objects or
things about which information exists in the dictionary,
attributes as properties (quantities or qualities) of the
entities, and relationships as connections between entities.

In the DDS, resources such as data, hardware, software,
transactions, personnel and documents may be represented,
and entities, attributes, and relationships associated with
these resources must also be represented. Tables I through V
at the end of this chapter, taken from [Ref. 4], indicate
possible data element attributes, file entity attributes,
hardware entities and attributes, software entities and
attributes, and document/report attributes for the SPLICE
system.

Similar entities in a DDS establish entity types. Attributes
can also have a degree of similarity and in this case we
speak about attribute types. Finally, similar considerations
apply to relationships and so we have relationship types,
that are relationships between entity types.

Schema descriptor: In a dictionary schema containing all
existing entity-types, relationship-types, and
attribute-types, any one of them can be referred to as a
schema descriptor. Information existing in the schema can
indicate which entity-types are members of a given
relationship-type, and which attribute-types are associated
with an entity-type or relationship-type.

Entity-types of a DDS can be classified as data
entity-types, process entity-types and usage entity-types.
On the other hand attribute types can be descriptions,
classification and audit attributes created by the
dictionary to indicate identification of the person who
created the entity, date of entity creation, identification
of the person who last modified the entity, date of latest
modification, and total number of modifications of the
entity [Ref. 6]. These capabilities are very useful for a
system, especially one as complex as SPLICE. Using the above
capabilities, reports and summaries can be presented on
request, and we can also have a trace of various
interactions on the system using application programs for
this purpose.
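
The entity-relationship-attribute view of the dictionary and
the audit attributes listed above can be sketched as follows.
This is only an illustration under our own assumptions: the
class names, the field names and the sample entity names are
hypothetical and are not taken from [Ref. 6] or from the
SPLICE design.

```python
from dataclasses import dataclass
from datetime import date


@dataclass
class Audit:
    """Audit attributes maintained by the dictionary itself."""
    created_by: str
    created_on: date
    modified_by: str
    modified_on: date
    modifications: int = 0


@dataclass
class Entity:
    name: str
    entity_type: str      # e.g. "data", "process" or "usage"
    attributes: dict      # attribute name -> value
    audit: Audit


class DictionarySchema:
    """Toy dictionary holding entities and typed relationships."""

    def __init__(self):
        self.entities = {}
        self.relationships = []   # (relationship_type, from_name, to_name)

    def add_entity(self, name, entity_type, attributes, user, today):
        audit = Audit(user, today, user, today)
        self.entities[name] = Entity(name, entity_type, dict(attributes), audit)

    def modify(self, name, attribute, value, user, today):
        # Every change is recorded in the audit attributes of the entity.
        entity = self.entities[name]
        entity.attributes[attribute] = value
        entity.audit.modified_by = user
        entity.audit.modified_on = today
        entity.audit.modifications += 1

    def relate(self, relationship_type, from_name, to_name):
        # Both ends must already be defined in the dictionary.
        if from_name not in self.entities or to_name not in self.entities:
            raise KeyError("both entities must be defined before relating them")
        self.relationships.append((relationship_type, from_name, to_name))

    def related(self, relationship_type, from_name):
        return [t for (r, f, t) in self.relationships
                if r == relationship_type and f == from_name]
```

A report or trace program of the kind mentioned above would
simply read the audit field of each entity.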

3. Interfaces and Commands



Interfaces must be included in a DDS in order to allow the
user to communicate with the DDS via a terminal. The
terminal-DDS communication in the SPLICE system is carried
out through the Session Services module. This is a separate
topic which will be examined separately. In general an
interface can be as shown in Table VI.

On the other hand commands can be classified, on the basis
of their functionality, into various categories as shown in
Table VII.

A dictionary system can be regarded as a software product
that helps in storing information about data that already
exists in databases. Both DDS and DBMS deal with
descriptions and characteristics of data elements and with
the logical structures obtained from these elements and
their relationships. A closely integrated dictionary system
and automated database design process have much to offer.
The interfaces between a dictionary and a database design
process can be divided into two broad categories:

-Initial data entry and editing
-Logical model structuring

Initial data entry and editing: For data entry, the data
requirements information needed by automated database design
procedures is almost a complete (proper) subset of the
information normally stored in current commercial dictionary
systems. For SPLICE the files already exist but the
dictionary does not. Therefore the whole design of the DDS
must provide for initial detection and avoidance of
duplicate entries. As soon as the design takes care of that
during the initial steps, the entry of information about raw
data elements has to be made only to the dictionary system.
Next an interface must exist in order to allow the design
procedures to access information in named aggregations
(local views). For editing, the initial data entry is rarely
clean in the sense that names, usage, and characteristics of
the data elements may not yet be standardized across local
views. Synonyms, homonyms and inconsistent characteristics
of the same data usually result when data requirements are
gathered from different sources. The editing phases of the
automated design procedures, and the reports produced
therein, can serve as an input filtering function for the
dictionary. When the interactive editing phases are
completed, obsolete information (e.g. non-standard names)
can be removed from the dictionary, such that the
information remaining permanently is clean and consistent.
Again, as we mentioned in a previous section, this can be
done only for new applications because the task of
retrofitting a dictionary to existing application systems is
very difficult.
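
The kind of input filtering described above can be sketched
in a few lines. We assume here, only for illustration, that a
homonym is one name carrying different characteristics in
different local views, and that a synonym candidate is a group
of different names carrying identical characteristics. The
function names and the sample views are hypothetical; a real
editing phase would need human review of every candidate.

```python
from collections import defaultdict


def find_homonyms(local_views):
    """Names that appear with more than one set of characteristics."""
    seen = defaultdict(set)
    for view in local_views.values():
        for name, characteristics in view.items():
            seen[name].add(characteristics)
    return sorted(name for name, variants in seen.items() if len(variants) > 1)


def find_synonym_candidates(local_views):
    """Groups of different names that share identical characteristics."""
    by_characteristics = defaultdict(set)
    for view in local_views.values():
        for name, characteristics in view.items():
            by_characteristics[characteristics].add(name)
    return sorted(sorted(names) for names in by_characteristics.values()
                  if len(names) > 1)


# Each local view maps an element name to a (type, length) tuple.
views = {
    "receiving": {"STOCK-NO": ("char", 13), "QTY": ("num", 5)},
    "issuing": {"NSN": ("char", 13), "QTY": ("num", 7)},
}
homonyms = find_homonyms(views)            # QTY has two different lengths
synonyms = find_synonym_candidates(views)  # NSN and STOCK-NO look alike
```

The two lists would be presented to the data administrator as
an editing report before anything is stored permanently.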

Logical model structuring: The structuring procedure for
initial design should be able to extract filtered,
unstructured data element information in named aggregates
(local views) from the dictionary such that the composite
model and the derived logical designs can be generated in
the normal manner.

For adding new requirements to existing designs and when
processing new functions or adding new data to an existing
database, the design process should be able to extract from
the dictionary a description of the existing design along
with the filtered unstructured data element information for
that which is new. Various levels of constraints on the
freedom of structuring processes can be set here in order to
facilitate the whole design effort.

Once the automated design process is completed and a
suitable logical design has been obtained, the results must
be stored in the dictionary. Assuming the unstructured data
elements are already described in the dictionary, the
relationships defining segments, databases, logical
relations and secondary indexes would now be stored.
TABLE I

Data Element Attributes

Type
Range
Length
Unit of measure
Usage
Language names
Repetitions
88 levels
Key
Default value
Display format

TABLE II

File Entity Attributes
TABLE III

Selected Hardware Entities and Attributes

Entities
Processing system
Secondary storage
Communications system
Concentrators
Terminals
LAN I/O peripherals

Attributes
Type
Model
Model number
Serial number
Mfgr's number
Source
Features
Description
Document references
Usage by site
Cost
Maintenance activity
TABLE IV

Selected Software Entities and Attributes

Entities
Operating system
Operational support system
Environmental system
Application software

Attributes
Program-id
Revision number
Revision date
Date compiled
Type of compiler
Patch level
Change level
License
Date released
Product number
Source
Features
Documentation
Usage
Cost
Maintenance activity
TABLE V

Document/Report Attributes

Name
Number
Product number
Release date
Revision number
Source
Feature
Description
Quantity
Cost



TABLE VI

Kinds of DDS Interfaces

Command language
Screen oriented interface
Fixed format batch data entry facility
Programmatic interface that allows user written
applications programs to access the dictionary
TABLE VII

Command Categories for DDS

Dictionary maintenance
Report and query
Data structure interface
Extensibility
Status related
Security
Dictionary processing
Dictionary administration
III. INTEGRATION



A. THE PROBLEM

An active⁵ data dictionary is desirable for the SPLICE
system. It is also known [Ref. 8] that most dictionaries
fail to meet this objective. A prerequisite to an active
dictionary is a high degree of interaction between the
dictionary and various other software elements such as the
DBMS itself, but also including query languages, report
generators, application development aids, and the like. An
architecture for a centered and highly integrated DDS from
[Ref. 6] is shown in Figure 3.1.

The existing dictionaries today are noticeably unintegrated,
and hence less than active. Such a situation is shown in
Figure 3.2 (taken from [Ref. 8]) concerning the IBM DB/DC
data dictionary and related software. Notice in particular
that, whereas some batch feeding of data is provided to
and/or from the dictionary, there are no less than six
places where database definition data is stored in addition
to data definitions included in actual programs [Ref. 8].
These are:

-The DB/DC dictionary itself
-The DBD/PSB libraries
-The COBOL copy library
-The database design aid (DBDA)
-The GIS data definition tables
-The application development facility (ADF), segment rules
in an IMS/DC environment, or in development management
system (DMS) files in a CICS environment.

⁵ Active to some degree, because if it is too active it can
lose efficiency.

There is no guarantee that each of these descriptions agrees
at any point in time. Other data dictionaries have a higher
degree of integration but no one is close to the degree of
integration suggested in Figure 3.1. A high level of
integration is very much needed in order to support the
advanced functions of an active dictionary. To see that
better, consider a user who wants to know what data is in
the database, or a DML routine which wants to edit a field
prior to updating the database, or the database access
system which needs to know if a user password is valid for
updating a certain record. All the above functions require
direct access to the data dictionary.

Figure 3.1 Highly Integrated D/D Centered Architecture.

The extent to which a DDS qualifies as being "integrated" is
a relative notion determined by the scope of its metadata
and the way that it interfaces with other software. The most
common use of the term "integrated" is with reference to a
D/D that is the sole source of metadata in the system. The
integrated D/D is accessed for all references to metadata.
Most of the commercially available DDS have not reached a
high degree of integration with their environments, and this
results in multiple sources of descriptors within the
systems. The DDS permits these systems to access the D/D
indirectly and convert the metadata of each system to the
format required by the D/D [Ref. 5]. So for example a DDS
might communicate with a compiler in either of two ways:

-By generating file and record definitions that the compiler
accepts via copy statements.

-By reading source programs and creating transactions to
load the DDS with descriptions of files, records, and
elements.

One additional area which demands investigation for the
development of a successful DDS concerns integrating schemas
which describe the logical structures of all data types
existing in a distributed database (like that of SPLICE).
This feature permits the determination of a data file's
logical structure as well as its identity and location, and
could possibly be essential to the development of query and
data model translation schemes. The existence of a master
schema also permits the logical relation of data across file
boundaries; then all files in the network can be considered
as areas within a single large database [Ref. 9].
Figure 3.2 IBM Data Management Architecture.

B. INTEGRATION OF DDS

Three aspects of integrated DDS in the centralized and
distributed database environment for SPLICE⁶ are of interest
and must be emphasized [Ref. 5].

-The software interfaces
-The convert functions
-The environmental dependency between the DDS and the DBMS

A DDS is integrated with other software packages by
facilities that:

-Allow direct and indirect access to the D/D
-Automatically capture the metadata used by other systems

In the next three subsections we will examine these most
interesting aspects of an integrated DDS.

1. Software Interfaces

A software interface permits another system to access the
D/D either statically or dynamically. First consider the
static interface, which links the D/D to another system
indirectly via the extraction of a file of formatted
metadata. For the static interface of a DDS to a DBMS, for
example, the data dictionary administrator, following the
specifications of the data administrator, enters into the
DDS all pertinent transactions to define the database, and
the database administrator using these definitions describes
the database. After reviewing the accuracy of this database
description, a command is generated for the DDS that uses
this description to produce a file containing the DDL. The
DBMS's DDL processor then translates this generated DDL into
a schema file that the run time unit of the DBMS can access.
No run-time connection between the DDS and the DBMS exists
here; the DBMS's processor is not executing during the DDS's
DDL-generation process.

⁶ Our approach for the SPLICE database and data dictionary
distribution is hybrid. SPLICE is a distributed system, but
the databases are centralized within each LAN. Also the
dictionary copies at each of the selected LAN's are exact
copies of the master dictionary and different dictionary
views are not permitted. So the whole SPLICE system can be
viewed as a distributed system, but concerning each
particular LAN, the database and data dictionary can be said
to follow the centralized database environment concept. So
both ideas of centralized and distributed environments can
be applied to the SPLICE with slight modifications.

Static interfaces differ somewhat, depending upon whether
they interface the DDS with user-written programs or with
vendor-supplied software packages. Static interfaces for
programs written in languages such as COBOL and PL/I produce
file, record, and database descriptions for the user
programs from the data dictionary [Ref. 5]. These interfaces
sometimes feature edit capabilities, format options, and
various other functions to make the interface more flexible.
Edit capabilities may include being able to add prefixes and
suffixes and even to replace entire names. Format options
may control indentation, level-number increments, sequence
numbers, and line identifiers. Inclusion of various clauses
such as comments, condition names, and initial values also
may be allowed.
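
A static interface of this kind can be sketched as a function
that turns dictionary entries into COBOL-style record text,
with a prefix as an edit capability and indentation and
level-number increment as format options. The function name,
its parameters and the sample element names are hypothetical,
and the output is only COBOL-like text that has not been
checked against any compiler.

```python
def generate_record(record_name, elements, prefix="", start_level=1,
                    level_increment=4, indent=4):
    """Return COBOL-style record definition lines from dictionary entries.

    elements is a list of (name, picture) pairs taken from the dictionary.
    prefix is prepended to every generated name (an edit capability);
    level_increment and indent are format options.
    """
    lines = [f"{start_level:02d}  {prefix}{record_name}."]
    level = start_level + level_increment
    for name, picture in elements:
        lines.append(f"{' ' * indent}{level:02d}  {prefix}{name}  PIC {picture}.")
    return lines


copy_member = generate_record(
    "STOCK-REC",
    [("STOCK-NO", "X(13)"), ("QTY", "9(5)")],
    prefix="WS-",
)
```

The resulting lines would be written to a library file and
brought into a program with a copy statement; no connection to
the dictionary is needed when the program is compiled or run.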

Static interfaces for software packages, such as DDL
processors, communication monitors, and query processors,
produce formatted statements for those packages or create
specially encoded control files for their use.

Static interfaces are prevalent because of their utility,
capability, and efficiency. With powerful static interfaces,
the data administrator can quickly change formatted metadata
or create new formatted definitions from existing D/D
entities. The static D/D can be made compatible with many
versions of other software packages and can be developed
independently of the source code of particular software
packages. A disadvantage to the user of a static interface
is the extra effort that may be required to generate and
catalog metadata for the D/D.

More significantly, the static interface itself has no
capabilities for updating the metadata of the systems with
which it interfaces. Without adequate synchronization
controls, the metadata in the DDS and the metadata in other
systems may become inconsistent [Ref. 5].

Dynamic interfaces provide direct access by the DDS to other
software modules. This direct access is commonly achieved
via high-level interface commands that shield the software
package from the physical details of the D/D. The commands
activate standard DDS functions, so as to select entity
occurrences that satisfy a particular condition. A DDS can
provide a facility that makes commands available through
call statements; any program can then access the D/D without
knowledge of its physical structure. Dynamic interfaces
provide consistency control and capabilities for both update
and retrieval. Changes to the D/D are automatically
reflected in the next execution of any software packages to
which the D/D is interfaced; no intervening procedures are
required as with static interfaces. A software package can
directly retrieve and update metadata stored in the D/D if
the user has the authority to do so, and the software
interface has such a capability. Otherwise the software
interface and the user would only have read authority to the
D/D.


Here is where special attention must be given when designing
a DDS for the SPLICE. We said previously, when we examined
the first and the second hierarchical structure for SPLICE,
that the local copies of the SPLICE DDS will be exact images
of the master copy. With this approach one can imagine what
will happen if one program in any of the 62 sites attempts
to update the metadata stored in the DDS. The whole
consistency of the system is gone. The local
s in the master copy. These
mitted to the various locations
and executed. This we believe is
e proposed DDS architecture which
er the whole SPLICE system. We
of operation is purely dynamic.
We might call it a hybrid
the security and validity checks
ed.

c interfaces incurs significant
and complex structure of DDS.
port aids, such as preprocessors,
and design aids generally can
e response time is not critical,
ncy is critical for transaction-
erence the D/D.

tial overhead, common queries may
in the D/D. Another technique
is for the software package to
required for a transaction at
for this transaction only involve
from [Ref. 5] shows some typical
interfaces for DDS.

tware interfaces the integration
onment is provided by convert
functions. A DDS organization has a lot of programs, reports
and files to manage. The data/data dictionary administrator
must encode thousands of maintenance transactions to capture
the metadata of all these applications. The convert
functions of a DDS scan source programs, database
descriptions, and teleprocessing environment descriptions
and automatically produce maintenance transactions, thus
sparing the data administrator many hours of manual effort.
Figure 3.3 from [Ref. 5] illustrates the flow of data
through a typical convert function.




Figure 3.3 System Flow for a Convert Function.

Inputs include the source language statements and the D/D;
outputs are a file of transactions to be input to the D/D
maintenance module (in the case of SPLICE that refers to the
maintenance module of the master copy) and a report.

The D/D maintenance transactions include descriptions of
databases, files, records, groups, elements and programs.
The prime purpose of convert functions is to convert
metadata from both user-written programs and from the local
DBMS and its related components. Table IX illustrates in
summary the typical D/D convert function transactions.

Four major characteristics [Ref. 5] for convert functions
are:

-The content of the generated transactions, where the D/D
maintenance transactions created by a convert function
usually also contain the relationships between data
entities.

-The input file to a convert function, which can be a source
program or a library file.

-Command options, which may include the ability to change
names, select lines to scan, select types of transactions to
create, and override generation of some types of metadata,
where the ability to analyze the metadata of source programs
can make the DDS a valuable tool for auditing adherence to
software control techniques.
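
The flow through a convert function can be sketched as
follows. The sketch recognizes only a very small, simplified
subset of COBOL data description lines; the regular
expression, the transaction layout (action, kind, name,
picture, member_of) and the report fields are assumptions made
for illustration and are not the format of [Ref. 5].

```python
import re

# Matches lines such as "05  STOCK-NO  PIC X(13)." (simplified COBOL subset).
_DATA_LINE = re.compile(r"^\s*(\d{2})\s+([A-Z0-9-]+)(?:\s+PIC\s+(\S+?))?\.\s*$")


def convert(source_lines, known_names):
    """Scan source lines; return (maintenance transactions, report).

    known_names stands in for the D/D input: names already defined there
    produce MODIFY transactions, all others produce ADD transactions.
    """
    transactions = []
    skipped = 0
    parent = None
    for line in source_lines:
        match = _DATA_LINE.match(line)
        if not match:
            skipped += 1          # comments and unsupported statements
            continue
        _level, name, picture = match.groups()
        action = "MODIFY" if name in known_names else "ADD"
        if picture is None:
            # A group item without a picture is treated as a record.
            parent = name
            transactions.append({"action": action, "kind": "RECORD",
                                 "name": name})
        else:
            # The relationship to the enclosing record is captured as well.
            transactions.append({"action": action, "kind": "ELEMENT",
                                 "name": name, "picture": picture,
                                 "member_of": parent})
    report = {"transactions": len(transactions), "lines_skipped": skipped}
    return transactions, report
```

The two return values correspond to the two outputs named
above: the transaction file for the D/D maintenance module and
the report for the administrator.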

3. Environmental Dependency

This characteristic of a DDS is determined by its reliance
on a specific hardware configuration, an operating system, a
DBMS, or a teleprocessing monitor. Under ideal conditions a
DDS must have the capability to operate in such an
environment without losing efficiency and functionality. But
sometimes the practice deviates from theory.
In a completely integrated DDS the DBMS accesses stored
databases via the D/D. In a less integrated system, the DBMS
may maintain its own directory file for accessing stored
databases.

In the independent approach the DDS is completely
autonomous; it does not rely on any particular DBMS, and the
DBMS maintains its own source of metadata.

In the DBMS application approach the D/D appears to the DBMS
as just another database. The DBMS maintains its own
metadata for each database and these metadata are separate
from the D/D.

For the SPLICE system, it is proposed that the embedded
approach be used, where the DDS is actually a component of
the DBMS. This approach provides complete integration of the
DDS. The D/D is the only source of metadata. The DBMS
utilities provide the D/D management facilities and the DBMS
uses the D/D to directly access the stored databases. No
other directories internal or external exist for the DBMS,
and the DBMS and its facilities rely completely on the D/D
for metadata. Such a structure is shown in Figure 3.4.

Figure 3.4 SPLICE Embedded Approach to DDS.



So for example a query processor extracts user views from
the DDS and the DBMS applies integrity constraints specified
in the DDS by the DDS administrator before storing a data
element. A major difficulty here, that the SPLICE designers
must overcome, is the fact that the DBMS for SPLICE already
exists but the DDS does not. The embedded approach is easier
and simpler when both DDS and DBMS are developed in
parallel, but this is not the case for the SPLICE. So
special attention and effort must be applied during the DDS
development phase.
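
The embedded approach can be illustrated with a short sketch
in which the DBMS keeps no metadata of its own and asks the
dictionary for the integrity constraints before storing a data
element. The classes EmbeddedDictionary and Dbms and their
methods are hypothetical names used only for this
illustration.

```python
class EmbeddedDictionary:
    """The only source of metadata; here it holds integrity constraints."""

    def __init__(self):
        self._constraints = {}     # element name -> list of check functions

    def constrain(self, element, check):
        # Called on behalf of the DDS administrator.
        self._constraints.setdefault(element, []).append(check)

    def constraints_for(self, element):
        return list(self._constraints.get(element, []))


class Dbms:
    """Stores data elements; relies completely on the dictionary for rules."""

    def __init__(self, dictionary):
        self._dictionary = dictionary
        self._stored = {}

    def store(self, element, value):
        # Every constraint from the dictionary is applied before storing.
        for check in self._dictionary.constraints_for(element):
            if not check(value):
                raise ValueError(f"{element}: integrity constraint violated")
        self._stored[element] = value

    def fetch(self, element):
        return self._stored[element]
```

Retrofitting this arrangement onto a DBMS that already exists
is the harder case noted above, because the existing DBMS was
not written to consult an external dictionary.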
TABLE VIII

Types of Software Packages Interfaced with a D/D System

Module                    Description

DDL processor             Creates a schema file

Database control system   Run-time unit of a DBMS

Preprocessor              Translates DML into CALL
                          statements

Query/update processor    Provides direct end-user
                          access to stored databases

Batch-code generator      Reduces the time to develop a
                          standard function as compared
                          to a compiler-level language

Source-program manager    Provides security protection,
                          data compression and editing
                          capabilities for source
                          programs

Teleprocessing monitor    Provides the capability of
                          interactive computing to
                          remote terminals

Test-data generator       Creates test files and
                          databases according to user
                          specifications

Design aid                Analyzes and generates designs
                          of databases or information
                          systems
TABLE IX

Transactions for D/D Convert Function

Module type               Generated transactions

Programming               Element, group, record, file,
                          and sometimes subschema and
                          process

Database description      Database, file, subschema,
                          relationship, record, group,
                          element

Teleprocessing            Terminal, line, processor,
                          transaction
IV. SESSION SERVICES AND DATA DICTIONARY

A. GENERAL

The term "session" is defined in [Ref. 4] as follows:

"Session: All the activity (message exchange and processing)
which takes place between two or more processes for the
duration of a single task (e.g. text editing or processing
of a transaction file)."

The session services module of the SPLICE has to play the
role of coordinating the activity of the other functional
modules and providing them with work instructions via the
service codes it inserts in messages to the FM's. The
sequence of operations may be data dependent or highly
interactive, so in some cases, work breakdown cannot be
completely determined in advance by the session services. In
such cases session services passes control to the first
(controlling) FM which is to perform an operation, and
subsequent "calls" to other FM's, if any, take place
according to processing conditions. In all cases, however,
session services passes control to the first (controlling)
FM, even though all the FM's which will be involved cannot
always be determined in advance. Session services retains
and maintains state information until either a completion
message or error message has been received from the
controlling FM. In the case of a message which is destined
for an object located in another network, this fact is
indicated in the "message type" field. The physical
destination address would have been obtained previously from
the data dictionary, which exemplifies the relationship
between session services and data dictionary.
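
The bookkeeping described above can be sketched as follows.
The class name, the message fields and the "local"/"remote"
values of the message type are assumptions made for
illustration; [Ref. 4] does not define them in this form. The
dictionary is modeled as a simple mapping from an object name
to its network and physical address.

```python
class SessionServices:
    """Keeps session state and asks the data dictionary for addresses."""

    def __init__(self, dictionary):
        # dictionary maps an object name to (network, physical_address).
        self._dictionary = dictionary
        self._active = {}          # session id -> state information
        self._next_id = 1

    def start(self, local_network, controlling_fm, target):
        """Build the message that passes control to the controlling FM."""
        network, address = self._dictionary[target]
        message = {
            "session": self._next_id,
            "to": controlling_fm,
            # An object in another network is flagged in the message type.
            "message_type": "local" if network == local_network else "remote",
            "destination": address,
        }
        self._active[self._next_id] = {"controlling_fm": controlling_fm,
                                       "target": target}
        self._next_id += 1
        return message

    def receive(self, session, status):
        # State is retained until a completion or error message arrives.
        if status in ("completion", "error"):
            return self._active.pop(session)
        return self._active[session]

    def active_sessions(self):
        return sorted(self._active)
```

Only the first (controlling) FM appears in the state; any
further calls between FM's are left to the FM's themselves, as
in the text.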
Figure 4.1 Cooperation Between SS and Functional Modules.
Session services is used in a distributed environment and
involves the seven layer architecture model of the ISO for
distributed networks. The ISO seven layer architecture is a
standard one and involves the following layers with the
associated functions:

Layer           Function

Application     User process

Presentation    Formats data the way the user wants it

Session         Sets up session between communicating
                processes

Transport       End to end control

Network         Switching, routing

Data link       Reliable transmission between two nodes

Physical        Physical transmission of bits between
                two nodes

The complexity of the SPLICE processing requires that user
terminal processes be given considerable assistance in
carrying out their tasks [Ref. 4]. Session services can
provide this assistance. User terminal processes specify
task environments, largely by task name and the assistance
of the data dictionary, where necessary (Figure 1.1).



B. ARCHITECTURE INTERFACES



In the SPLICE layered architecture, the interfaces between
the layers are critically important. In particular we are
very interested in the software interfaces between the
modules which communicate with the data dictionary. These
modules are the session services module and the DBMS module.
Some forms of software interfaces between DBMS and D/D can
be found in the current literature [Ref. 5]. On the other
hand no one has yet defined the required software interfaces
between the D/D and session services modules. We believe
that the above mentioned software interfaces must be of the
same type and closely related to the interfaces between the
end user and the session services. In a centralized system
where session services does not exist, the end user has to
interface directly with the D/D, but in a distributed system
the session services module acts as the mediator between the
end user and the data dictionary. As a minimum then, the
interfaces between session services and the data dictionary
in a distributed system must include the interfaces between
end user and data dictionary in the centralized model.

The interfaces between the above modules must be designed to
accommodate new mechanisms and, as far as possible, new
functions when they may arise. As new mechanisms and network
functions come into use in the system, it is highly
desirable that previously written programs continue to work.
This is achieved by designing the interfaces appropriately
and preserving them. In the seven layer architecture, layers
4, 5, 6 and 7 provide end-to-end communication between
sessions in user machines. Layers 1, 2 and 3 provide
communication with the nodes of the shared network.

Because the SPLICE system uses a modified ISO layered
approach, the interfaces between machines need to be defined
in terms of the layers. So we will have layer headers and
control messages that are passed between the layers. The
application programmer does not need to know anything about
these. For example any command language, using commands
similar to GET, PUT, OPEN, CLOSE and DELETE, can refer to
data or facilities in a distant machine.
C. THE SESSION SERVICES MODULE



There are differences in the session services provided
depending upon type of network. In the distributed
environment different types of user software need different
types of session services. These differences involve not
only the software but also the architecture. So one set of
session services may be provided for one manufacturer's
architecture and a different set for another. This is very
important for the SPLICE because the hardware used
throughout the system varies. It may be possible that
services provided across the system are of different types.
However it is desirable to have common session services,
because this will facilitate the maintenance task. Also for
interfacing purposes we want session services to present a
common image to the system. This can be accomplished by
building necessary interface units from the session
services. In [Ref. 10 pp 491] there is a description of
possible functions of the session services subsystem in a
distributed network. These functions are generally divided
into three large groups:

-Functions required when setting up or disconnecting a
session.

-Functions used during the normal running of a session.

-Functions employed when something goes wrong, such as a
node failure or a protocol violation.

More precisely these functions are divided in the following
categories:

--Assistance in establishing a session
--Basic networking functions
--Application macroinstructions
--Program control facilities
--File access functions
--Recovery and error control
--Editing and translation
--Dialogue software
--Virtual operations and transparency
--Compaction
--Payment functions
--Security and audit functions

D. INTERFACES

Functional interfaces between session services and the data
dictionary must permit other software modules to access the
D/D and convert metadata into the format required by the
DDS.

A DDS provides many functions and features such as:

Maintenance
Extensibility
Report processor
Query processor
Convert
Software interface
Exit facility

The software interface function must provide a formatted
pathway enabling the DDS to provide metadata to other
software systems such as compilers and DDL processors
[Ref. 5], to retrieve information from the DDS, to update
information where it is permitted, and to obtain the
restriction protocols for data consistency and integrity.
The software interface can generate file descriptions for
storage in a program library, or accept the user
identification and generate a copy of that user's database
view. It is not possible for this study to describe
precisely the software interfaces needed for the SPLICE
system. Because this system is under development, many
aspects of the system are still unknown and the software
modules are not yet described in full detail. So, we will
only outline some of the software interfaces without
claiming that these are sufficient for the SPLICE system.
Interfaces can be added to the system during the later
stages of the system life cycle; existing interfaces can
also be changed or improved as needed.

Because COBOL is used throughout the system, the COBOL
"GENERATE" command can create from the D/D fully formatted
file and record definitions that can be stored in a library
file. Included can be most COBOL clauses such as 88 levels,
SYNCHRONIZED, REDEFINES, and OCCURS. The OPTION clause of
this command can permit changes in names, the designation of
sequence numbers, level numbers and identifiers, and the
inclusion of program comments. An example of the use of
this command can be found in [Ref. 5 pp 261]. The generation
date, last revision date, and revision number can be
automatically recorded in both the listing and the D/D.

The output file can also contain job control statements.
The output file can then be executed as a job that creates
and catalogs the COBOL metadata as a member of a library
under control of any of the various source program managers.
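
To make the idea of generating record definitions from dictionary metadata more concrete, here is a minimal Python sketch. It is not the GENERATE command itself: the layout of the dictionary entry, the record and field names, and the function `generate_cobol` are all hypothetical. It only shows how level numbers, names, PICTURE clauses, and an optional comment might be turned into fixed-format COBOL data description lines.

```python
# Hypothetical dictionary entry: (level number, data name, PICTURE or None).
PART_RECORD = [
    (1, "PART-RECORD", None),
    (5, "PART-NUMBER", "X(13)"),
    (5, "QUANTITY-ON-HAND", "9(6)"),
]

def generate_cobol(entry, comment=None):
    """Return COBOL data description lines for one dictionary entry."""
    lines = []
    if comment:
        # An asterisk in column 7 marks a comment line in fixed-format COBOL.
        lines.append(f"      * {comment}")
    for level, name, picture in entry:
        # Level 01 starts in area A (column 8), other levels in area B (column 12).
        indent = " " * (7 if level == 1 else 11)
        clause = f" PIC {picture}" if picture else ""
        lines.append(f"{indent}{level:02d}  {name}{clause}.")
    return lines

copybook = generate_cobol(PART_RECORD, comment="GENERATED FROM D/D")
```

The resulting list of lines could be written to a library member; the job control statements mentioned above are outside the scope of this sketch.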

A DML processor can also be used to interface between the
session services and the data dictionary. A source program
triggers the DML processor by sending a service code through
the session services, and the DML processor interacts with
the data dictionary/directory. The output of the DML
processor is an expanded source program that is sent to a
compiler for compilation.
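
The expansion step can be sketched in a few lines of Python. The `%INCLUDE` directive, the dictionary contents, and the function `expand` are assumptions introduced for this illustration; the actual DML processor syntax is not described in this study.

```python
# Hypothetical data dictionary: record name -> declaration lines.
DICTIONARY = {
    "PART-RECORD": ["01  PART-RECORD.", "    05  PART-NUMBER PIC X(13)."],
}

def expand(source_lines, dictionary):
    """Replace each '%INCLUDE name' line with the dictionary's declaration."""
    expanded = []
    for line in source_lines:
        if line.startswith("%INCLUDE "):
            name = line.split(maxsplit=1)[1].strip()
            if name not in dictionary:
                raise KeyError(f"{name} is not defined in the dictionary")
            expanded.extend(dictionary[name])
        else:
            expanded.append(line)
    return expanded

program = ["DATA DIVISION.", "%INCLUDE PART-RECORD", "PROCEDURE DIVISION."]
expanded_program = expand(program, DICTIONARY)
```

In this picture, `expanded_program` plays the role of the expanded source program that is handed to the compiler.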

Other kinds of interfaces include query processors,
source program managers, various user interface facilities,
and other software packages.

Figure 4.2 Software Interface Using a DML Processor



V. D/D IN DISTRIBUTED ENVIRONMENT

A. INTRODUCTION



In this chapter we will consider the design and function
of the D/D in the distributed database environment. Some
extensions to the centralized D/D are needed in order to
enable it to function effectively in a distributed
environment.

The distributed system is a subset of a general
information system. It is not necessary for the user to know
how or where the data is stored, in what way the data will
be accessed by a program, or how and where the processing is
accomplished. Unless the dictionary plays a highly active
role in the running of the distributed system, there is
little need to try to share one dictionary over the entire
network. This is because there is not likely to be a large
amount of update activity in a dictionary. The dictionary
can normally be reproduced at each node and this is the
proposed solution for SPLICE. By using such an architecture,
problems of updating the dictionary across the network can
be solved without much overhead.

Of course the problem of distributed control in a
network is more complex than that of the hierarchical
architecture of dictionary systems which has been discussed
in chapter two. This is one reason, in addition to the lack
of experience with distributed data dictionary systems, why
we proposed replication instead of distribution of the data
dictionary for SPLICE. The more the dictionary system acts
as either the control mechanism or a repository of control
information, the more complex the DBMS, network operating
systems, and dictionary system interactions become. For
example, in the case where we want to determine the best
location for running a query against a distributed and
partially replicated database [Ref. 6] the dictionary system
is required to retain information on the location of all
data. Indeed, this may be highly dynamic itself, and
therefore the line between a dictionary and a "real"
database becomes very fuzzy.

Creation of a distributed information resource implies
that a number of hardware and software components are to
be designed and integrated into a controlled environment.
These components in SPLICE include several databases and
database management systems, user language interfaces, a
data dictionary/directory catalogue, transaction controllers
and data input/output control modules. We will describe the
various system components and we will also attempt to
demonstrate their integration with the International
Organization for Standardization (ISO) communications
architecture, and a data storage and retrieval architecture
(DSRA).

In general, a distributed system must provide to the end
user transparency, data sharing, data transfer, process
transfer, or a facility for combination of strategic,
managerial and operational reporting. In order to do that
there are several environmental constraints that must be
satisfied [Ref. 12]. These are:

Data communications
Data storage and retrieval
Metadata
User language support
Process and report management
Information representation
System management
Integrity
Security

For the SPLICE system, communication must be integrated
with cooperative processing of the various different
existing software and hardware. In order to do that we need
to address the considerations of the database interface with
distributed system tasks.

A distributed database is particularly useful to
applications that involve extensive processing in different
locations. SPLICE fits exactly in the above concept, as do
airlines, banking, retail, and military command and control
applications. The distributed database of SPLICE can be
allocated among the nodes of the network according to
various existing criteria for fragmentation. To avoid
confusion in distributed systems two different terms are
used: partitioned database, which consists of
non-overlapping subsets, and replicated database, which has
some data redundancy [Ref. 5]. Replication enforces the
locality and availability of the database and reduces the
frequency of accessing the DDN, but requires the DBMS to
provide more sophisticated concurrency and recovery
procedures. To avoid expensive overhead in data management,
restrictions must be established as to the degree of data
replication permitted. SPLICE belongs in the class of
replicated databases because the same item of the database
can be located in several locations, while the local
databases also provide information for items stored in only
one location.
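
The distinction between the two terms can be stated as a simple test on how items are allocated to nodes. The following Python sketch is illustrative only; the function name and the node and item identifiers used with it are hypothetical.

```python
def classify(allocation):
    """Classify an allocation of items to nodes.

    allocation maps a node name to the set of item identifiers stored
    there. The result is 'partitioned' when no item is stored at more
    than one node, and 'replicated' otherwise.
    """
    seen = set()
    for items in allocation.values():
        if seen & items:          # some item already held by another node
            return "replicated"
        seen |= items
    return "partitioned"
```

For example, `classify({"A": {1, 2}, "B": {2, 3}})` reports a replicated database because item 2 is held at both nodes.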

Major problems in the development of techniques for a
distributed database are due to communication volumes and
delays and to the potential for parallel processing.
Sometimes it is very difficult to apply working solutions to
distributed data processing which are borrowed from the
centralized processing concept. These solutions often work

environment and do not transfer effi- 
delays may occur. Parallel 

processing also has the potential to increase throughput,
but requires complex controls to synchronize concurrent
activities at dispersed sites. Because a data dictionary is
a database containing metad
in distributed databases a
n proposed for SPLICE DDS, we can
plication instead of distribution
and, there are some other reason-
ow more closely the distributed
with distributed systems is rela-
eded to reach a decision must be
r to avoid mistakes.
The designer of a DDS encounters some of the same basic
problems as does the designer of a distributed database.
When we design a D/D we must determine the extent of
environmental dependency between the D/D and the DBMS. As we
said before, the distributed D/D is an extension of the
centralized one and so the three basic variations to the
type of relationships between a DDS and a DBMS are still in
force. In the independent distributed approach the DDS has
no running connections to any portions of the DBMS and is
not actively or directly used in transaction processing by
the DBMS. In the DBMS-application approach the D/D is just
another distributed database to the DBMS and separate data
management functions are not needed to handle the D/D. The
DBMS may manage its own run-time directory that is separate
from the D/D. In the embedded distributed approach the D/D
provides the run-time directory for the DBMS. All the
components of the DBMS obtain their metadata from the D/D.
The size, location, and contents of the D/D would also
affect the performance of other DDS functions such as
maintenance, reporting, and query [Ref. 5].

B. A MODEL FOR A DISTRIBUTED DDS

In this section we are going to examine a distributed
model for the SPLICE DDS. Its structure is shown in Figure
2.4, and involves the partition of the global DDS into
different views containing information for one or more local
databases. These different views can be located at each or
selected LAN's.

The global (or network) dictionary is the nucleus around
which all the management functions of a DDS are centered.
It contains [Ref. 11] information to start every management
process of the SPLICE distributed database. In particular
it contains:

a.-Information for the DDS design

-File access programs
-Total volumes of queries for each file
-Total volumes of updates for each file

This statistical information is very useful, especially
for evaluating the optimal number of redundant copies.

b.-Information for the distribution function

-Number and types of transmission links, their unit
cost, their mean utilization factor
-Routing tables
-CPU workloads
-Disk utilization

This information can help determine the optimal
allocation of redundant file copies and of possible
operation parallelism.

c.-General information about data and how data is shared
among the various nodes of the system. This includes the
number of D/D copies and where they are located.

d.-Information about existing constraints, status of the
system, node failures etc.

e.-Information about data transportability

f.-Information related to data used by applications
having a global view. Such applications are, for example,
those where different local databases are involved for
execution. We said in a previous section that sometimes
data redundancy is preferable over the frequent use of the
DDN. That means information about the sites where a
component (e.g. a spare part) is located must be somewhere
in a central position. So in the case where the component
cannot be found in the local database, the user has to
access the global data dictionary to find the places where
the particular item is located.
etc)

Data translation maps, access paths
Data entities
Common procedures
Events and their interrelations

This dictionary must be able to answer queries about the
DBs and DBMS's involved in a transaction and how the
transaction can be formulated to obtain the most efficient
result.

Local dictionaries include information about local
databases and applications, local data entities, local
procedures, local interrelations, physical storage
structures of local data, access methods, access paths,
physical storage devices, and redundancy of data items.

In [Ref. 11] a structure is proposed for a distributed
D/D quite different from the SPLICE approach. This
structure, as shown in Figure 5.1, involves the existence
of:

Network dictionary
Global external dictionary
Global conceptual dictionary
Local external dictionary
Local conceptual dictionary
Internal dictionary

and each one of the above performs a different function.
This architecture, which is purely distributed, is probably
too complicated to be implemented for SPLICE. It is a
theoretical model and if we try to implement it, we may
face serious interface problems, resulting in the data
dictionary becoming the main resource consumer.

Figure 5.1 Purely Distributed Approach for a DDS.

The functions we intend to include in the SPLICE DDS
will play a major role if we want to avoid a complex
structure and saturation. These functions must be the
minimum needed for the proper operation of the system. We
believe that, if the distributed approach is followed
instead of the replicated one, the architecture shown in
Figure 2.4 is the more practical.
Following the above architecture, a global dictionary
located in some node has the role of maintaining consistency
throughout the whole SPLICE system. Requests for updates,
deletions, and additions are routed through the data
dictionary administrator and after an evaluation procedure
the global dictionary is updated. Then the changes are
transmitted to the various locations where the local copies
are updated. Updates are also transmitted to the data
directory.
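
The update flow just described can be sketched as follows. This Python fragment is a simplified illustration, not a design for the SPLICE DDS: the class and attribute names are hypothetical, the administrator's evaluation procedure is reduced to a boolean flag, and the data directory is reduced to a mapping from object name to address.

```python
class GlobalDictionary:
    """Master copy from which local copies and the directory are updated."""

    def __init__(self):
        self.entries = {}        # object name -> definition
        self.local_copies = []   # one dict per location holding a copy
        self.directory = {}      # object name -> address (simplified)

    def apply(self, name, definition, address, approved):
        """Apply one change reviewed by the administrator and propagate it."""
        if not approved:         # the evaluation procedure rejected the request
            return False
        self.entries[name] = definition
        for copy in self.local_copies:
            copy[name] = definition
        self.directory[name] = address
        return True
```

A rejected request leaves the global dictionary, the local copies, and the directory untouched, so every copy changes only as a consequence of an accepted change to the global dictionary.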

Data directories can be located at the inventory control
points (ICP). In contrast with the data dictionary, the
data directory contains global information only about
subject, service code, object name, and address. All the
other information is located in the global and the various
local dictionaries. The data dictionary administrator is
responsible for maintaining the data directory as well.
Different views of the global dictionary are located in
various LAN's. Each view can serve one or more LAN's and it
is preferably located at the LAN where it is most
frequently used in order to avoid unnecessary usage of the
DDN.

When an item is not found in the local database the user
routes a value location request through the session services
(service code) to the data directory, and the data directory
replies with the location address. Using this information
the user can request and establish a session with the remote
database where the requested information resides.
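
The lookup sequence above amounts to a two-step search. The Python sketch below illustrates it under simplifying assumptions: the local database and the data directory are modelled as plain mappings, and the function name `locate` and the return convention are hypothetical.

```python
def locate(item, local_db, data_directory):
    """Return ('local', value) or ('remote', address) for the requested item."""
    if item in local_db:
        return ("local", local_db[item])
    if item in data_directory:
        # The directory only supplies an address; the caller must then
        # establish a session with the remote database to obtain the data.
        return ("remote", data_directory[item])
    raise LookupError(f"{item} is not known to the data directory")
```

Only the second branch involves traffic outside the local LAN, which is why the placement of views and directory copies matters for DDN usage.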
VI. CONCLUSIONS AND RECOMMENDATIONS

A. CONCLUSIONS

Our objectives, as described in the first chapter, were
to investigate the area of data dictionary/directory
systems in a distributed environment, to outline the
advantages and disadvantages of these systems, to present
the underlying ideas, to examine the benefits for the SPLICE
system from using a dictionary/directory system, and finally
to delineate the interface requirements between a data
dictionary/directory system and other functional modules.
In addition to the above objectives we also discussed some
ideas concerning the organization of the data administration
function, and four hierarchical architectures for the DDS,
each one with a different degree of distribution.

The first architecture is based on the replication of
the D/D. There are no different views of the D/D, only
exact copies of one view located in each LAN. Using this
architecture we have 62 replicated copies of the D/D (the
same as the number of LAN's), each containing the
information (metadata) about all SPLICE data base
definitions and functions residing in each LAN. This
architecture minimizes access to the DDN but has the
drawback of requiring a lot of secondary storage. The size
of the D/D, statistical and other information concerning the
frequency of using the DDN, and the amount of information
included in the D/D will all have an impact on the
effectiveness of this architecture.

The second architecture, which allocates replicated
copies of the D/D to selected nodes (the most active), is
more conservative. In the case of a huge dictionary, this
saves a significant amount of secondary storage, but
requires heavier use of the DDN. Here the size of the D/D
and the appropriate nodes at which to install the replicated
copies seriously affect the effectiveness of this
architecture.

The third architecture is based on distribution of the
D/D. Different views of the D/D reside in each LAN and
contain information only concerning the local data base.
This architecture involves the use of a data directory (we
propose two replicated copies, one located in each ICP).
The use of the data directory (which contains limited
information) provides a kind of "relation or connection"
between the various views. Also a global dictionary is
needed in order to provide consistency and global function
facilities throughout the system. This architecture is more
dynamic than the previous two. It has the advantage of
saving secondary storage but, on the other hand, increases
even more the use of the DDN.

A fourth architecture was discussed just to mention
another possibility for a distributed architecture, but our
estimation is that it would be too expensive in system
resource consumption for SPLICE.
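
The storage side of the trade-off among the first three architectures can be expressed with simple arithmetic. The Python sketch below is only a rough model: the figure of 62 LAN's and the two directory copies come from the text, while the sizes passed to the functions are arbitrary units, and the assumption that the local views together hold the dictionary contents exactly once is ours. DDN traffic, the other side of the trade-off, is not modelled.

```python
LANS = 62   # number of LAN's, from the text
ICPS = 2    # proposed data directory copies, one per ICP

def storage_full_replication(dd_size):
    """First architecture: one complete D/D copy in every LAN."""
    return LANS * dd_size

def storage_selected_nodes(dd_size, selected):
    """Second architecture: complete copies only at the selected nodes."""
    return selected * dd_size

def storage_distributed(dd_size, directory_size):
    """Third architecture: a global dictionary, local views assumed to
    cover the dictionary once in total, and the replicated directory."""
    return dd_size + dd_size + ICPS * directory_size
```

With a dictionary of 100 units, 10 selected nodes, and a directory of 5 units, the three functions give 6200, 1000, and 210 units respectively, which matches the qualitative ordering described above.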

Three environmental dependency options for the DDS
(independent, completely integrated, and DBMS dependent)
were also discussed. The main reasons for choosing the
embedded (DBMS dependent) approach are that the data
dictionary is going to be used only for the SPLICE system
(so the independent approach does not make any sense), and
that the SPLICE data base already exists. The embedded
approach was also chosen because of the homogeneity of the
DBMS environments across LAN's. The independent and
completely integrated approaches are too costly at this
time, although the latter could eventually be implemented
from an embedded environment.
B. RECOMMENDATIONS



From the investigations performed, we have the following
main recommendations for the SPLICE system:

a.- The TANDEM data dictionary that already exists
should be the basis for the SPLICE data dictionary.

b.- The D/D should be implemented only for new
applications because it is a herculean task to retrofit the
D/D to the existing old applications.

c.- The embedded (DBMS dependent) approach should be
used for the D/D.

d.- Two candidate architectures should be examined
further based on statistical and other information (not
available for the present thesis):

-Replicated architecture (Figure 2.3) with
selection of nodes where each copy will reside.

-Distributed architecture (Figure 2.4) with the
use of two replicated copies of the data
directory, one located at each ICP.

e.- A DML processor should be used to interface between
the data dictionary and session services.
APPENDIX A

TANDEM DATA DICTIONARY

1. Overview

This appendix is included to mention some features
(hopefully the most important) of the TANDEM data
dictionary, since the TANDEM DBMS will be used in the SPLICE
system. For a more detailed description of the TANDEM D/D,
see [Ref. 13].

A data definition language (DDL) is a language used
by the data dictionary administrator to describe record and
file structures of a database. After the description, the
resulting source file is input to the DDL compiler, and the
DDL compiler can create data declaration source language for
database records in three languages: COBOL, FORTRAN, and
TAL. The DDL compiler can also produce FUP (file utility
program) file creation commands for database files. The
most significant feature of DDL is its ability to create and
maintain a data dictionary. The TANDEM data dictionary is a
set of seven files that documents the structure and location
of each file in a database.

The DDL provides facilities for updating a
dictionary as the database it describes grows and the
structure of the database files changes. The DDL compiler
and the dictionary it creates serve as a central point of
control over a database.

TANDEM defines a database as a collection of files
structured to serve one or more applications. When a list
of DDL statements --a DDL source schema-- is given to the
DDL compiler, the compiler can produce any of the following
files:

* A data dictionary.

* A FUP file creation command source.

* A data declaration source for COBOL,
FORTRAN, or TAL.

* A schema report summarizing each record's
structure and each file's access keys.

The data dictionary produced by the DDL compiler is a
set of files that forms a permanent record of the database
schema. Thus the database schema, stored as a set of
dictionary files, becomes a system resource. The dictionary
gives database managers information about each file in the
database and also shows how the files relate to each other.
After the dictionary has been created, the DDL compiler can
open the dictionary and produce COBOL, FORTRAN, or TAL data
declaration source for any record defined by the schema.
The dictionary is also used by ENFORM, TANDEM's database
query language and report writer.



2. Creating a Dictionary

The data dictionary files can be created on any
subvolume in the system. The subvolume that is to contain
the data dictionary is specified with the DDL DICT command
(for example ?DICT $STOCKNO.QNTY). The DDL compiler
creates the dictionary files on the quantity subvolume of
the $STOCKNO volume, and then opens the files for access.



3. Dictionary Reports

TANDEM provides DDL users with ENFORM source for
twelve dictionary reports. The twelve reports document
the DEFINITION and RECORD entries in the dictionary,
describing not only their structures, but how they relate
to each other as well.

Once a schema describing a database has been
compiled by the DDL compiler and a dictionary has been
produced, information about the database can easily
TABLE I

Dictionary Report Summary

Query
Name    Report description

R1      DICTIONARY OBJECTS- R1 describes each DEF and
        RECORD in the dictionary, giving the time and
        date of creation, the time and date of the
        last modification, and the version number for
        each object.

R2      DEFINITION STRUCTURE- R2 lists all of the
        component groups and fields for each DEF in
        the dictionary.

R3      RECORD STRUCTURE- R3 lists all of the
        component groups and fields for each RECORD
        in the dictionary.

R4      DEFINITIONS USING DEFINITIONS- R4 shows
        which DEFs are referenced by other DEFs.
        Each referencing DEF is listed with each
        of its elements that references another
        DEF and the referenced DEF's name.

R5      RECORDS USING DEFINITIONS- R5 shows which
        DEFs are referenced by RECORDs. Each RECORD
        is listed with each of its elements that
        references a DEF and the referenced DEF's
        name.

R6      DEFINITIONS WHERE USED- R6 lists each DEF
        that is referenced by another object, be it
        a DEF or a RECORD. The referencing DEF or
        RECORD is shown in each case.

R7      RECORD ACCESS- R7 lists the file name and
        access keys (both primary and alternate) for
        each RECORD in the dictionary.

R8      RECORD DEFINITION METHOD- R8 shows the method
        used to define each RECORD. The source DEF
        is listed for those RECORDs defined with the
        DEF IS <def name> clause.

R9      REPORT HEADINGS- R9 lists all of the ENFORM
        report headings declared for fields and
        groups within each DEF and RECORD in the
        dictionary.

R10     DISPLAY FORMATS- R10 lists all of the ENFORM
        display formats declared for fields and
        groups within each DEF and RECORD in the
        dictionary.

R11     RECORD COMMENTS- R11 lists the comments that
        immediately preceded the defining RECORD
        statement for each RECORD in the dictionary.

R12     DEFINITION COMMENTS- R12 lists the comments
        that immediately preceded the defining DEF
        statement for each DEF in the dictionary.






TABLE II

Dictionary Modification Function

Operat./Ent. type     Procedure

ADD/DEF               Open dictionary with ?DICT and
                      compile new DEF statement.

ADD/RECORD            Open dictionary with ?DICT and
                      compile new RECORD statement.

DELETE/DEF            Open dictionary with ?DICT, delete
                      all dictionary entries that
                      reference the DEF, and then delete
                      the DEF itself with DELETE.

DELETE/RECORD         Open dictionary with ?DICT
                      and then delete the RECORD entry
                      with the DELETE statement.

MODIFY/DEF            Open dictionary with ?DICT
                      command, then delete all other
                      RECORD and DEF entries that
                      reference the DEF, delete the DEF,
                      recompile the edited DEF, and,
                      finally, recompile the DEF and
                      RECORD statements that
                      reference the DEF.

MODIFY/RECORD         Open dictionary with ?DICT
(with no              and recompile edited
DEF changes)          RECORD statement.

MODIFY/RECORD         Open dictionary with ?DICT
(with DEF             and delete the RECORD with
changes)              the DELETE statement. Then
                      modify any DEF entries that need
                      to be changed, and finally,
                      recompile the new record statement.
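
The MODIFY/DEF procedure is the most involved entry in the table because of its ordering constraints: dependents are deleted before the DEF, and recompiled after it. The Python sketch below lists the steps for a given DEF. It is illustrative only: the function name, the tuple representation of a step, and the `references` mapping are hypothetical, only direct references are considered, and the step labels are descriptions rather than actual DDL syntax (apart from the ?DICT command named in the table).

```python
def modify_def_steps(target, references):
    """List the steps of the MODIFY/DEF procedure for one definition.

    references maps each dictionary entry name (DEF or RECORD) to the
    set of DEF names that it references directly.
    """
    dependents = sorted(n for n, used in references.items() if target in used)
    steps = [("open", "?DICT")]
    steps += [("delete", name) for name in dependents]   # referencing entries first
    steps.append(("delete", target))                      # then the DEF itself
    steps.append(("compile", target))                     # edited DEF
    steps += [("compile", name) for name in dependents]   # restore dependents
    return steps
```

A where-used listing such as report R6 supplies exactly the information held in `references` here.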